Abstract

Consider k populations $\pi_i$, i = 1,...,k, where an observation from $\pi_i$ has a binomial distribution with parameters N and $p_i$ (unknown). Let $p_{[k]} = \max_{1 \le j \le k} p_j$. A population $\pi_i$ with $p_i = p_{[k]}$ is called a best population. We are interested in selecting the best population. Let $\mathbf{p} = (p_1, \ldots, p_k)$ and let i denote the index of the selected population. Under the loss function $\ell(\mathbf{p}, i) = p_{[k]} - p_i$, this statistical selection problem is studied via the empirical Bayes approach.

Some  selection  rules  based  on  monotone  empirical  Bayes  estimators 
of  the  binomial  parameters  are  proposed.  First,  it  is  shown  that, 
under  the  squared  error  loss,  the  Bayes  risks  of  the  proposed  monotone 
empirical  Bayes  estimators  converge  to  the  related  minimum  Bayes 
risks with rates of convergence at least of order $O(n^{-1})$, where n is
the  number  of  accumulated  past  experiences  at  hand.  Further,  for 
the  selection  problem,  the  rates  of  convergence  of  the  proposed 
selection rules are shown to be at least of order $O(\exp(-cn))$ for
some $c > 0$.
EMPIRICAL BAYES RULES FOR SELECTING
THE  BEST  BINOMIAL  POPULATION 


1.  Introduction 

In  many  situations,  an  experimenter  is  often  confronted  with 
choosing  a  model  which  is  the  best  in  some  sense  among  those  under 
study.  For  example,  consider  k  different  competing  drugs  for  a  certain 
ailment.  We  would  like  to  select  the  best  among  them  in  the 
sense  that  it  has  the  highest  probability  of  success  (cure  of 
the  ailment) .  This  kind  of  binomial  model  occurs  in  many  fields, 
such  as  medicine,  engineering,  and  sociology.  The  problem  of 
selecting  a  binomial  model  associated  with  the  largest  probability 
of success was first considered by Sobel and Huyett (1957) and
Gupta and Sobel (1960). The former used the indifference zone
formulation and the latter studied the subset selection approach;
see Gupta and Huang (1976), Gupta, Huang and Huang (1976), and
Gupta and McDonald (1986) for further variations in goals and
procedures for this problem.

Now, consider a situation in which one will be repeatedly
dealing with the same selection problem independently. This is
the case, for example, with ongoing drug testing.

In  such  instances,  it  is  reasonable  to  formulate  the  component 
problem  in  the  sequence  as  a  Bayes  decision  problem  with  respect 
to  an  unknown  prior  distribution  on  the  parameter  space,  and  then, 
use the accumulated observations to improve the decision rule at each
stage. This is the empirical Bayes approach of Robbins (see
Robbins (1956, 1964 and 1983)). Many such empirical Bayes rules
have been shown to be asymptotically optimal in the sense that
the risk for the nth decision problem converges to the minimum
Bayes risk which would have been obtained if the prior
distribution were known and the Bayes rule with respect to this
prior distribution were used.

Empirical Bayes rules have been derived for subset selection
goals by Deely (1965). Recently, Gupta and Hsiao (1983)
and Gupta and Leu (1983) have studied empirical Bayes rules for
selecting good populations with respect to a standard or a
control, where the underlying populations are uniformly
distributed. Gupta and Liang (1984) studied empirical Bayes
rules for selecting binomial populations better than a standard
or a control.

In this paper, we obtain empirical Bayes procedures for
selecting the best among k different binomial populations.
These rules are based on monotone empirical Bayes estimators
of the binomial success probabilities. First, it is shown
that, under the squared error loss, the Bayes risks of the
proposed monotone empirical Bayes estimators converge to the
related minimum Bayes risks with rates of convergence at least
of order $O(n^{-1})$. Further, for the selection problem, the rates
of convergence of the proposed selection rules are shown to
be at least of order $O(\exp(-cn))$ for some $c > 0$.


2. Formulation of the Empirical Bayes Approach

Consider k binomial populations $\pi_i$, i = 1,...,k, each
consisting of N trials. For each i, i = 1,...,k, let $p_i$ be the
probability of success for each trial in $\pi_i$, and let $X_i$ denote the
number of successes among the associated N trials. Then, $X_i \mid p_i$
is binomially distributed with probability function

$$f_i(x \mid p_i) = \binom{N}{x} p_i^{x} (1 - p_i)^{N - x}, \quad x = 0, 1, \ldots, N.$$

Let $f(\mathbf{x} \mid \mathbf{p}) = \prod_{i=1}^{k} f_i(x_i \mid p_i)$,
where $\mathbf{x} = (x_1, \ldots, x_k)$ and $\mathbf{p} = (p_1, \ldots, p_k)$. For each $\mathbf{p}$, let
$p_{[1]} \le \cdots \le p_{[k]}$ be the ordered parameters $p_1, \ldots, p_k$. It is
assumed that the exact matching between the ordered and the
unordered parameters is unknown. Any population $\pi_i$ with
$p_i = p_{[k]}$ is considered as the best population. Our goal is to
derive empirical Bayes rules to select the best population.

Let $\Omega = \{\mathbf{p} \mid \mathbf{p} = (p_1, \ldots, p_k),\ p_i \in (0, 1),\ i = 1, \ldots, k\}$ be the
parameter space and $G(\mathbf{p}) = \prod_{i=1}^{k} G_i(p_i)$ be the prior distribution
over $\Omega$. Let $A = \{i \mid i = 1, \ldots, k\}$ be the action space. When
action i is taken, it means that population $\pi_i$ is selected as the
best population. For the parameter $\mathbf{p}$ and action i, the loss
function $\ell(\mathbf{p}, i)$ is defined as

$$\ell(\mathbf{p}, i) = p_{[k]} - p_i, \tag{2.1}$$

the difference between the best and the selected population.


Let $\mathcal{X} = \prod_{i=1}^{k} \{0, 1, \ldots, N\}$ be the sample space. A selection
rule $d = (d_1, \ldots, d_k)$ is a mapping from $\mathcal{X}$ to $[0,1]^k$ such that for
each observation $\mathbf{x} = (x_1, \ldots, x_k)$, the function $d(\mathbf{x}) =
(d_1(\mathbf{x}), \ldots, d_k(\mathbf{x}))$ satisfies $0 \le d_i(\mathbf{x}) \le 1$, i = 1,...,k, and
$\sum_{i=1}^{k} d_i(\mathbf{x}) = 1$. Note that $d_i(\mathbf{x})$, i = 1,...,k, is
the probability of selecting population $\pi_i$ as the best population
when $\mathbf{x}$ is observed.

Let $\mathcal{D} = \{d \mid d: \mathcal{X} \to [0,1]^k,\ \text{being measurable}\}$ be the set of
all selection rules. For each $d \in \mathcal{D}$, let $r(G, d)$ denote the
associated Bayes risk. Then, $r(G) = \inf_{d \in \mathcal{D}} r(G, d)$ is the minimum
Bayes risk.

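From (2.1), the Bayes risk associated with the selection rule $d$
is

$$r(G, d) = \int_{\Omega} \sum_{\mathbf{x} \in \mathcal{X}} \ell(\mathbf{p}, d(\mathbf{x}))\, f(\mathbf{x} \mid \mathbf{p})\, dG(\mathbf{p})
= C - \sum_{\mathbf{x} \in \mathcal{X}} \Big[ \sum_{i=1}^{k} d_i(\mathbf{x})\, \varphi_i(x_i) \Big] f(\mathbf{x}), \tag{2.2}$$

where

$$f(\mathbf{x}) = \prod_{i=1}^{k} f_i(x_i), \qquad \varphi_i(x) = \frac{W_i(x)}{f_i(x)},$$

$$f_i(x) = \int_0^1 f_i(x \mid p)\, dG_i(p), \qquad W_i(x) = \int_0^1 p\, f_i(x \mid p)\, dG_i(p),$$

$$C = \sum_{\mathbf{x} \in \mathcal{X}} \int_{\Omega} p_{[k]}\, dG(\mathbf{p} \mid \mathbf{x})\, f(\mathbf{x}) \ \text{being a constant},$$

and $G(\mathbf{p} \mid \mathbf{x})$ is the posterior distribution of $\mathbf{p}$ given $\mathbf{x}$.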
For each $\mathbf{x} \in \mathcal{X}$, let

$$A(\mathbf{x}) = \{ i \mid \varphi_i(x_i) = \max_{1 \le j \le k} \varphi_j(x_j) \}. \tag{2.3}$$

Thus, a randomized Bayes rule is
$d_G = (d_{1G}, \ldots, d_{kG})$, where

$$d_{iG}(\mathbf{x}) = \begin{cases} |A(\mathbf{x})|^{-1}, & \text{if } i \in A(\mathbf{x}); \\ 0, & \text{otherwise}; \end{cases} \tag{2.4}$$

and |A| denotes the size of the set A.
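As an illustration of (2.3) and (2.4), the following minimal sketch computes the Bayes rule in the special case where each prior $G_i$ is a Beta(a, b) distribution with integer parameters. The Beta priors, the function names, and the numerical values are assumptions made for this example only; the paper itself places no such restriction on $G_i$. Under a Beta(a, b) prior the posterior mean is $(a + x)/(a + b + N)$, and exact fractions are used so that ties are detected exactly.

```python
from fractions import Fraction

def posterior_mean_beta(x, N, a, b):
    """Posterior mean of p given x successes in N trials under a Beta(a, b) prior."""
    return Fraction(a + x, a + b + N)

def bayes_selection(x, N, priors):
    """Selection probabilities (d_1G(x), ..., d_kG(x)) of rule (2.4).

    x      : observed success counts (x_1, ..., x_k)
    priors : list of hypothetical integer Beta parameters (a_i, b_i)
    """
    phi = [posterior_mean_beta(xi, N, a, b) for xi, (a, b) in zip(x, priors)]
    best = max(phi)
    # A(x) in (2.3): indices attaining the largest posterior mean.
    A = [i for i, v in enumerate(phi) if v == best]
    return [Fraction(1, len(A)) if i in A else Fraction(0) for i in range(len(x))]

# Three populations, N = 10 trials each, identical uniform priors (illustrative).
print(bayes_selection([7, 7, 5], 10, [(1, 1)] * 3))
```

With identical priors the first two populations tie, so each is selected with probability 1/2, which is the randomization that $|A(\mathbf{x})|^{-1}$ encodes.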

When the prior distribution G is unknown, it is impossible
to apply the Bayes rule. In this case, we use the
empirical Bayes approach. Note
that, for each i, $\varphi_i(x_i)$ is the posterior mean of the binomial
probability $p_i$ given that $X_i = x_i$ is observed. Due to the
surprising quirk that $\varphi_i(x_i)$ cannot be consistently estimated in
the usual empirical Bayes sense (see Robbins (1964), Samuel
(1963) and Vardeman (1978)), we use below an idea of Robbins in
setting up the empirical Bayes framework for our selection problem.


For each i, i = 1,...,k, at stage j, consider N+1 trials
from $\pi_i$. Let $X_{ij}$ and $Y_{ij}$, respectively, stand for the number of
successes in the first N trials and the last trial. Let $P_{ij}$
stand for the probability of success for each of the N+1 trials.
$P_{ij}$ has distribution $G_i$. Conditional on $P_{ij} = p_{ij}$,
$X_{ij} \mid p_{ij} \sim B(N, p_{ij})$, $Y_{ij} \mid p_{ij} \sim B(1, p_{ij})$, and $X_{ij} \mid p_{ij}$ and $Y_{ij} \mid p_{ij}$
are independent. Let $\mathbf{Z}_j = ((X_{1j}, Y_{1j}), \ldots, (X_{kj}, Y_{kj}))$ denote the
observations at the jth stage, j = 1,...,n. We also let $\mathbf{X}_{n+1} = \mathbf{X}
= (X_1, \ldots, X_k)$ denote the present observations.
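The sampling scheme just described can be sketched in a few lines of standard-library Python. This is only an illustration: the Beta priors used as $G_i$, the seed, and the helper name `simulate_stage` are assumptions of the example, not part of the paper.

```python
import random

def simulate_stage(N, prior_samplers, rng):
    """Draw one stage Z_j = ((X_1j, Y_1j), ..., (X_kj, Y_kj)).

    prior_samplers : one callable per population, each drawing P_ij from G_i.
    """
    stage = []
    for sample_p in prior_samplers:
        p = sample_p(rng)                              # P_ij ~ G_i
        x = sum(rng.random() < p for _ in range(N))    # X_ij | p ~ B(N, p)
        y = int(rng.random() < p)                      # Y_ij | p ~ B(1, p), the extra trial
        stage.append((x, y))
    return stage

rng = random.Random(0)
# Hypothetical priors: G_1 = Beta(2, 3), G_2 = Beta(3, 2).
samplers = [lambda r: r.betavariate(2, 3), lambda r: r.betavariate(3, 2)]
history = [simulate_stage(5, samplers, rng) for _ in range(100)]  # Z_1, ..., Z_100
```

The extra trial $Y_{ij}$ is drawn with the same success probability as the first N trials and independently of them given $p_{ij}$, which is what the construction requires.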


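Consider an empirical Bayes selection rule $d_n(\mathbf{x}; \mathbf{Z}_1, \ldots, \mathbf{Z}_n)
= (d_{1n}(\mathbf{x}; \mathbf{Z}_1, \ldots, \mathbf{Z}_n), \ldots, d_{kn}(\mathbf{x}; \mathbf{Z}_1, \ldots, \mathbf{Z}_n))$. Let
$r(G, d_n)$ be the Bayes risk associated with the selection rule
$d_n(\mathbf{x}; \mathbf{Z}_1, \ldots, \mathbf{Z}_n)$. Then,

$$r(G, d_n) = \sum_{\mathbf{x} \in \mathcal{X}} E \int_{\Omega} \ell(\mathbf{p}, d_n(\mathbf{x}; \mathbf{Z}_1, \ldots, \mathbf{Z}_n))\, f(\mathbf{x} \mid \mathbf{p})\, dG(\mathbf{p}), \tag{2.5}$$

where the expectation is taken with respect to $(\mathbf{Z}_1, \ldots, \mathbf{Z}_n)$. For
simplicity, $d_n(\mathbf{x}; \mathbf{Z}_1, \ldots, \mathbf{Z}_n)$ will be denoted by $d_n(\mathbf{x})$.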

Definition 2.1. A sequence of selection rules $\{d_n\}_{n=1}^{\infty}$ is said to
be asymptotically optimal relative to the prior distribution G if
$r(G, d_n) \to r(G)$ as $n \to \infty$.

From  (2.4),  a  natural  empirical  Bayes  selection  rule  can  be 
defined  as  follows: 

For each i = 1,...,k, and n = 1,2,..., let $\varphi_{in}(x) \equiv \varphi_{in}(x;
(X_{i1}, Y_{i1}), \ldots, (X_{in}, Y_{in}))$ be an estimator of $\varphi_i(x)$. Let $A_n(\mathbf{x}) =
\{ i \mid \varphi_{in}(x_i) = \max_{1 \le j \le k} \varphi_{jn}(x_j) \}$, and define $d_n(\mathbf{x}) = (d_{1n}(\mathbf{x}), \ldots,
d_{kn}(\mathbf{x}))$, where

$$d_{in}(\mathbf{x}) = \begin{cases} |A_n(\mathbf{x})|^{-1}, & \text{if } i \in A_n(\mathbf{x}); \\ 0, & \text{otherwise}. \end{cases} \tag{2.6}$$

If $\varphi_{in}(x) \xrightarrow{P} \varphi_i(x)$ for all x = 0,1,...,N and i = 1,...,k
(where "$\xrightarrow{P}$" means convergence in probability), then, by the
boundedness of the loss function $\ell(\mathbf{p}, i)$ and Corollary 2 of
Robbins (1964), it follows that $r(G, d_n) \to r(G)$ as $n \to \infty$. Thus, the
sequence of selection rules $\{d_n\}_{n=1}^{\infty}$ defined in (2.6) is
asymptotically optimal. Hence, our task is only to
find a sequence of estimators $\{\varphi_{in}(x)\}$ possessing the above
mentioned convergence property.

3. The Proposed Empirical Bayes Selection Rules

Before we go further to construct empirical Bayes estimators
$\{\varphi_{in}(x)\}$, we first investigate some properties related to the Bayes
rule $d_G$ defined in (2.4).

Definition 3.1. A selection rule $d = (d_1, \ldots, d_k)$ is said to be
monotone if for each i = 1,...,k, $d_i(\mathbf{x})$ is increasing in $x_i$ while
all other variables $x_j$ are fixed, and decreasing in $x_j$ for each
$j \ne i$ while all other variables are fixed.

Note that $\varphi_i(x)$ is the Bayes estimator of the binomial
parameter $p_i$ under the squared error loss given that $X_i = x$ is
observed. It is also easy to see that $\varphi_i(x)$ is increasing in x
for x = 0, 1, ..., N.

Definition 3.2. An estimator $\varphi(\cdot)$ is called a monotone estimator
if $\varphi(x)$ is an increasing function of x.

By the monotone property of the Bayes estimators $\varphi_i(x)$,
i = 1,...,k, one can see that the Bayes selection rule $d_G$ is a
monotone selection rule.

Under the squared error loss, the problem of estimating the
binomial parameter $p_i$ is a monotone estimation problem. By
Theorem B. 7 of Berger (1980), for a monotone estimation problem,
the class of monotone decision rules forms an essentially complete
class. With this consideration, it is reasonable to require that
the estimators $\{\varphi_{in}(x)\}$ under consideration possess the above
monotone property.


In the literature, Robbins (1956) and Vardeman (1978), among
others, proposed some estimators for $\varphi_i(x)$. Those estimators
are consistent in that they converge to $\varphi_i(x)$ in
probability. However, they do not possess the monotone property.
We now propose some monotone estimators.

For each i = 1,...,k, n = 1,2,..., and x = 0,1,...,N, define

$$f_{in}(x) = \frac{1}{n} \sum_{j=1}^{n} I_{\{x\}}(X_{ij}) + n^{-1}; \tag{3.1}$$

$$W_{in}(x) = \frac{1}{n} \sum_{j=1}^{n} Y_{ij}\, I_{\{x\}}(X_{ij}) + n^{-1}; \tag{3.2}$$

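where $I_A(\cdot)$ denotes the indicator function of the set A. Also
let, for each i = 1,...,k and j = 1,2,..., $V_{ij} = X_{ij} + Y_{ij}$.
Define

$$\tilde{W}_{in}(x) = \Big\{ \Big[ \frac{x+1}{N+1} \cdot \frac{1}{n} \sum_{j=1}^{n} I_{\{x+1\}}(V_{ij}) \Big] \wedge \Big[ \frac{1}{n} \sum_{j=1}^{n} I_{\{x\}}(X_{ij}) \Big] \Big\} + n^{-1}, \tag{3.3}$$

where $a \wedge b = \min(a, b)$. Let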


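$$\varphi_{in}(x) = W_{in}(x) / f_{in}(x); \tag{3.4}$$

$$\tilde{\varphi}_{in}(x) = \tilde{W}_{in}(x) / f_{in}(x); \tag{3.5}$$

and, for each $0 \le x \le N$, define

$$\varphi_{in}^{*}(x) = \max_{0 \le s \le x}\ \min_{s \le t \le N} \Big\{ \sum_{y=s}^{t} \varphi_{in}(y) / (t - s + 1) \Big\}; \tag{3.6}$$

$$\tilde{\varphi}_{in}^{*}(x) = \max_{0 \le s \le x}\ \min_{s \le t \le N} \Big\{ \sum_{y=s}^{t} \tilde{\varphi}_{in}(y) / (t - s + 1) \Big\}. \tag{3.7}$$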
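To make the construction concrete, here is a minimal sketch of (3.1), (3.2), (3.4) and the max-min monotonization (3.6) for a single population. The function name `estimates` and the small data set are hypothetical; the code follows the formulas literally and makes no attempt at efficiency.

```python
def estimates(xs, ys, N):
    """Return (phi_in, phi_star_in) evaluated at x = 0, ..., N.

    xs, ys : past observations X_i1..X_in and Y_i1..Y_in of one population.
    """
    n = len(xs)
    # (3.1): relative frequency of {X_ij = x}, shifted by 1/n.
    f = [sum(1 for xj in xs if xj == x) / n + 1 / n for x in range(N + 1)]
    # (3.2): relative frequency of {X_ij = x, Y_ij = 1}, shifted by 1/n.
    W = [sum(yj for xj, yj in zip(xs, ys) if xj == x) / n + 1 / n
         for x in range(N + 1)]
    phi = [W[x] / f[x] for x in range(N + 1)]            # (3.4)

    def avg(s, t):
        """Average of phi[s], ..., phi[t]."""
        return sum(phi[s:t + 1]) / (t - s + 1)

    # (3.6): max over s <= x of the min over t >= s of the block averages.
    phi_star = [max(min(avg(s, t) for t in range(s, N + 1)) for s in range(x + 1))
                for x in range(N + 1)]
    return phi, phi_star

phi, phi_star = estimates([0, 1, 1, 2, 2, 2], [1, 0, 1, 0, 1, 1], N=2)
```

In this toy data set the raw estimator is not monotone (its values are 1, 2/3, 3/4), while the monotonized version is constant at 29/36. Because the outer maximum in (3.6) runs over a set that grows with x, the result is nondecreasing by construction.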
Note that by (3.6) and (3.7), both $\varphi_{in}^{*}(x)$ and $\tilde{\varphi}_{in}^{*}(x)$ are
increasing in x. We propose $\varphi_{in}^{*}(x)$ (or $\tilde{\varphi}_{in}^{*}(x)$) as an estimator of
$\varphi_i(x)$. Let


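$$A_n^{*}(\mathbf{x}) = \{ i \mid \varphi_{in}^{*}(x_i) = \max_{1 \le j \le k} \varphi_{jn}^{*}(x_j) \}; \tag{3.8}$$

$$\tilde{A}_n^{*}(\mathbf{x}) = \{ i \mid \tilde{\varphi}_{in}^{*}(x_i) = \max_{1 \le j \le k} \tilde{\varphi}_{jn}^{*}(x_j) \}. \tag{3.9}$$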

Two selection rules $d_n^{*} = (d_{1n}^{*}, \ldots, d_{kn}^{*})$ and $\tilde{d}_n^{*} = (\tilde{d}_{1n}^{*}, \ldots, \tilde{d}_{kn}^{*})$
analogous to the Bayes selection rule $d_G$ are proposed as follows:


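For each i = 1,...,k, let

$$d_{in}^{*}(\mathbf{x}) = \begin{cases} |A_n^{*}(\mathbf{x})|^{-1}, & \text{if } i \in A_n^{*}(\mathbf{x}); \\ 0, & \text{otherwise}; \end{cases} \tag{3.10}$$

$$\tilde{d}_{in}^{*}(\mathbf{x}) = \begin{cases} |\tilde{A}_n^{*}(\mathbf{x})|^{-1}, & \text{if } i \in \tilde{A}_n^{*}(\mathbf{x}); \\ 0, & \text{otherwise}. \end{cases} \tag{3.11}$$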

Due to the monotone property of the estimators $\{\varphi_{in}^{*}(x_i);
i = 1, \ldots, k\}$ and $\{\tilde{\varphi}_{in}^{*}(x_i); i = 1, \ldots, k\}$, one can see that
$d_n^{*}$ and $\tilde{d}_n^{*}$ are both monotone selection rules.


4.  Asymptotic  Optimality  of  the  Monotone  Estimators 


In this section, we study the asymptotic optimality
property of the estimators $\varphi_{in}^{*}(x)$ and $\tilde{\varphi}_{in}^{*}(x)$. Under the squared
error loss, $\varphi_i(x)$ is the Bayes estimator of $p_i$. The associated
Bayes risk is

$$R_i(G_i) = E[(P_i - \varphi_i(X_i))^2]. \tag{4.1}$$


Let $\psi_i(\cdot)$ be any estimator of $p_i$ with the associated Bayes
risk $R_i(G_i, \psi_i)$. Then,

$$R_i(G_i, \psi_i) - R_i(G_i) = E[(\psi_i(X_i) - \varphi_i(X_i))^2]. \tag{4.2}$$
ill  ii  ii  ii 


Let $\{\psi_{in}(x; (X_{i1}, Y_{i1}), \ldots, (X_{in}, Y_{in})) \equiv \psi_{in}(x)\}$ be a sequence
of empirical Bayes estimators based on $(x; (X_{i1}, Y_{i1}), \ldots,
(X_{in}, Y_{in}))$.


Definition 4.1. A sequence of empirical Bayes estimators
$\{\psi_{in}\}_{n=1}^{\infty}$ is said to be asymptotically optimal at least of order
$\alpha_n$ relative to the prior $G_i$ if $R_i(G_i, \psi_{in}) - R_i(G_i) \le O(\alpha_n)$ as
$n \to \infty$, where $\{\alpha_n\}$ is a sequence of positive values satisfying
$\lim_{n \to \infty} \alpha_n = 0$.


Theorem 4.1. Let $\{\varphi_{in}^{*}\}$ and $\{\tilde{\varphi}_{in}^{*}\}$ be the sequences of empirical
Bayes estimators defined in (3.6) and (3.7), respectively. Then,

$$R_i(G_i, \varphi_{in}^{*}) - R_i(G_i) \le O(n^{-1})$$

and

$$R_i(G_i, \tilde{\varphi}_{in}^{*}) - R_i(G_i) \le O(n^{-1}).$$


The  following  lemmas  are  useful  in  presenting  a  concise  proof  of 
Theorem  4.1. 

Lemma 4.1. Let Z be a random variable and z be a real number
such that $-\infty < a \le Z,\ z \le b < \infty$. Then, for any s > 0,

$$E[\,|Z - z|^{s}\,] = \int_0^{z-a} s t^{s-1} P\{Z - z < -t\}\, dt + \int_0^{b-z} s t^{s-1} P\{Z - z > t\}\, dt,$$

provided that the expectation exists.


Proof: Straightforward computation.
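For readers who want the computation spelled out, one possible route (a sketch, not necessarily the authors' own argument) writes the power as an integral and exchanges expectation and integration by Tonelli's theorem:

```latex
\begin{align*}
|Z - z|^{s} &= \int_0^{\infty} s t^{s-1}\, I\{|Z - z| > t\}\, dt, \\
E\big[|Z - z|^{s}\big]
  &= \int_0^{\infty} s t^{s-1} P\{|Z - z| > t\}\, dt \\
  &= \int_0^{\infty} s t^{s-1} P\{Z - z < -t\}\, dt
   + \int_0^{\infty} s t^{s-1} P\{Z - z > t\}\, dt .
\end{align*}
% Since a <= Z <= b, the event {Z - z < -t} is empty for t >= z - a and the
% event {Z - z > t} is empty for t >= b - z, so the two integrals can be
% truncated at z - a and b - z respectively.
```

The truncation of the upper limits is the only place where the bounds $a \le Z \le b$ are used.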


$$P\{\varphi_{in}^{*}(x) - \varphi_i(x) > t\} \le \sum_{y=0}^{x} P\{\varphi_{in}(y) - \varphi_i(y) > t\};$$

$$P\{\varphi_{in}^{*}(x) - \varphi_i(x) < -t\} \le \sum_{y=x}^{N} P\{\varphi_{in}(y) - \varphi_i(y) < -t\}.$$

Proof: Parts a) and b) are straightforward from
(3.6). Part c) is a result of parts a) and b) and an application
of Bonferroni's inequality.

Remark 4.1. Lemma 4.2 is also true if $\varphi_{in}$ and $\varphi_{in}^{*}$ are replaced
by $\tilde{\varphi}_{in}$ and $\tilde{\varphi}_{in}^{*}$, respectively.

Lemma 4.3. For $0 < t < 1 - \varphi_i(x)$ and $0 \le y \le x$,

a) $P\{\varphi_{in}(y) - \varphi_i(y) > t\} \le \exp\{-2n\, a_1^2(t, y, n, i)\}$; and

b) $P\{\tilde{\varphi}_{in}(y) - \varphi_i(y) > t\} \le \exp\{-\tfrac{n}{2}\, a_1^2(t, y, n, i)\}$,

if $t > b(n, y, i)$, where $b(n, y, i) = (1 - \varphi_i(y))\, n^{-1} / (f_i(y) + n^{-1})$ and
$a_1(t, y, n, i) = t\,(f_i(y) + n^{-1}) - n^{-1}(1 - \varphi_i(y))$.

For $0 < t < \varphi_i(x)$ and $x \le y \le N$,

c) $P\{\varphi_{in}(y) - \varphi_i(y) < -t\} \le \exp\{-2n\, a_2^2(t, y, n, i)\}$; and

d) $P\{\tilde{\varphi}_{in}(y) - \varphi_i(y) < -t\} \le 2 \exp\{-\tfrac{n}{2}\, a_2^2(t, y, n, i)\}$, where
$a_2(t, y, n, i) = -t\,(f_i(y) + n^{-1}) - n^{-1}(1 - \varphi_i(y))$.

Proof: Here we prove part a) only. The other parts follow by
similar reasoning.

For $0 < t < 1 - \varphi_i(x)$ and $0 \le y \le x$, by (3.1), (3.2), (3.4)
and the fact that $\varphi_i(y) = W_i(y) / f_i(y)$, following a straightforward
computation, one can obtain

$$P\{\varphi_{in}(y) - \varphi_i(y) > t\}
= P\{W_{in}(y) - (\varphi_i(y) + t)\, f_{in}(y) > 0\}
= P\Big\{ \frac{1}{n} \sum_{j=1}^{n} I_{\{y\}}(X_{ij}) [Y_{ij} - \varphi_i(y) - t] + t f_i(y) > a_1(t, y, n, i) \Big\}. \tag{4.3}$$

Note that $I_{\{y\}}(X_{ij}) [Y_{ij} - \varphi_i(y) - t]$, j = 1,2,...,n, are i.i.d.,
$-\varphi_i(y) - t \le I_{\{y\}}(X_{ij}) [Y_{ij} - \varphi_i(y) - t] \le 1 - \varphi_i(y) - t$ for all
j, and $E[I_{\{y\}}(X_{ij}) [Y_{ij} - \varphi_i(y) - t]] = -t f_i(y)$. Also,
$a_1(t, y, n, i) > 0$ iff $t > b(n, y, i)$. Hence, by (4.3) and Theorem 2
of Hoeffding (1963), $P\{\varphi_{in}(y) - \varphi_i(y) > t\} \le \exp\{-2n\, a_1^2(t, y, n, i)\}$
if $t > b(n, y, i)$.
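The quantities in part a) are easy to evaluate numerically. The sketch below codes $b(n, y, i)$, $a_1(t, y, n, i)$ and the resulting bound; the function names and the values chosen for $f_i(y)$ and $\varphi_i(y)$ are hypothetical and serve only to show how the bound behaves as n grows.

```python
import math

def b_threshold(n, f_y, phi_y):
    """b(n, y, i) = (1 - phi_i(y)) n^{-1} / (f_i(y) + n^{-1})."""
    return (1 - phi_y) / n / (f_y + 1 / n)

def a1(t, n, f_y, phi_y):
    """a_1(t, y, n, i) = t (f_i(y) + n^{-1}) - n^{-1} (1 - phi_i(y))."""
    return t * (f_y + 1 / n) - (1 - phi_y) / n

def upper_tail_bound(t, n, f_y, phi_y):
    """Bound of Lemma 4.3 a); falls back to the trivial bound 1 when t <= b."""
    if t <= b_threshold(n, f_y, phi_y):
        return 1.0
    return math.exp(-2 * n * a1(t, n, f_y, phi_y) ** 2)

# Illustrative values: f_i(y) = 0.2, phi_i(y) = 0.5, deviation t = 0.1.
for n in (100, 1000, 10000):
    print(n, upper_tail_bound(0.1, n, 0.2, 0.5))
```

Since $a_1$ vanishes exactly at $t = b(n, y, i)$ and is increasing in t, the condition $t > b(n, y, i)$ is what makes the Hoeffding bound nontrivial. For fixed t the threshold shrinks like $n^{-1}$, so the bound eventually decays exponentially in n.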

Remark 4.2. Lemma 4.3 is still true if the strict inequality
< (>) is replaced by $\le$ ($\ge$).

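Lemma 4.4. For $0 \le y \le x$,

a) $\int_0^{1 - \varphi_i(x)} t\, P\{\varphi_{in}(y) - \varphi_i(y) > t\}\, dt \le O(n^{-1})$; and

b) $\int_0^{1 - \varphi_i(x)} t\, P\{\tilde{\varphi}_{in}(y) - \varphi_i(y) > t\}\, dt \le O(n^{-1})$.

For $x \le y \le N$,

c) $\int_0^{\varphi_i(x)} t\, P\{\varphi_{in}(y) - \varphi_i(y) < -t\}\, dt \le O(n^{-1})$; and

d) $\int_0^{\varphi_i(x)} t\, P\{\tilde{\varphi}_{in}(y) - \varphi_i(y) < -t\}\, dt \le O(n^{-1})$.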

Proof: We prove part a) only.

Case 1. If $b(n, y, i) \ge 1 - \varphi_i(x)$, then

$$\int_0^{1 - \varphi_i(x)} t\, P\{\varphi_{in}(y) - \varphi_i(y) > t\}\, dt \le \int_0^{b(n, y, i)} t\, dt = b^2(n, y, i)/2 = O(n^{-2}).$$

Case 2. If $b(n, y, i) < 1 - \varphi_i(x)$, then, by Lemma 4.3 a) and a
direct computation,

$$\int_0^{1 - \varphi_i(x)} t\, P\{\varphi_{in}(y) - \varphi_i(y) > t\}\, dt
\le \int_0^{b(n, y, i)} t\, dt + \int_{b(n, y, i)}^{1 - \varphi_i(x)} t\, P\{\varphi_{in}(y) - \varphi_i(y) > t\}\, dt
\le O(n^{-2}) + O(n^{-1}) = O(n^{-1}).$$
Proof of Theorem 4.1.


By (4.2),

$$0 \le R_i(G_i, \varphi_{in}^{*}) - R_i(G_i) = E[(\varphi_{in}^{*}(X) - \varphi_i(X))^2] = \sum_{x=0}^{N} E[(\varphi_{in}^{*}(X) - \varphi_i(X))^2 \mid X = x]\, f_i(x). \tag{4.4}$$


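By Lemmas 4.1-4.3 and the fact that $0 \le \varphi_{in}^{*}(x),\ \varphi_i(x) \le 1$,
one can obtain that

$$E[(\varphi_{in}^{*}(X) - \varphi_i(X))^2 \mid X = x]
= \int_0^{\varphi_i(x)} 2t\, P\{\varphi_{in}^{*}(x) - \varphi_i(x) < -t\}\, dt + \int_0^{1 - \varphi_i(x)} 2t\, P\{\varphi_{in}^{*}(x) - \varphi_i(x) > t\}\, dt$$

$$\le \sum_{y=x}^{N} \int_0^{\varphi_i(x)} 2t\, P\{\varphi_{in}(y) - \varphi_i(y) < -t\}\, dt + \sum_{y=0}^{x} \int_0^{1 - \varphi_i(x)} 2t\, P\{\varphi_{in}(y) - \varphi_i(y) > t\}\, dt. \tag{4.5}$$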


Then, by Lemma 4.4, (4.4), (4.5) and the fact that N is a
finite number, $R_i(G_i, \varphi_{in}^{*}) - R_i(G_i) \le O(n^{-1})$.
The similar claim for $\tilde{\varphi}_{in}^{*}$ is established along the same lines.
5. Asymptotic Optimality of the Selection Rules

Let $\{d_n\}_{n=1}^{\infty}$ be a sequence of empirical Bayes selection rules
relative to the prior distribution G. Since the Bayes rule $d_G$
achieves the minimum Bayes risk r(G), $r(G, d_n) - r(G) \ge 0$ for all
n = 1,2,... . Thus, the nonnegative difference $r(G, d_n) - r(G)$ is
used as a measure of the optimality of the sequence of empirical
Bayes rules $\{d_n\}_{n=1}^{\infty}$.

Definition 5.1. The sequence of empirical Bayes rules $\{d_n\}_{n=1}^{\infty}$ is
said to be asymptotically optimal at least of order $\beta_n$ relative
to the prior G if $r(G, d_n) - r(G) \le O(\beta_n)$ as $n \to \infty$, where $\{\beta_n\}$ is a
sequence of positive numbers such that $\lim_{n \to \infty} \beta_n = 0$.

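For each $\mathbf{x} \in \mathcal{X}$, let $A(\mathbf{x})$ be as defined in (2.3) and let
$B(\mathbf{x}) = \{1, \ldots, k\} - A(\mathbf{x})$. Thus, for each $\mathbf{x} \in \mathcal{X}$, $\varphi_i(x_i) > \varphi_j(x_j)$
for $i \in A(\mathbf{x})$ and $j \in B(\mathbf{x})$. Let $\varepsilon = \min_{\mathbf{x} \in \mathcal{X}} \{\varphi_i(x_i) - \varphi_j(x_j) \mid
i \in A(\mathbf{x}),\ j \in B(\mathbf{x})\}$. Hence, $\varepsilon > 0$ since $\mathcal{X}$ is a finite space.
Then,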

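$$0 \le r(G, d_n^{*}) - r(G)
\le \sum_{\mathbf{x} \in \mathcal{X}} P\Big\{ \max_{i \in A(\mathbf{x})} \varphi_{in}^{*}(x_i) \le \max_{j \in B(\mathbf{x})} \varphi_{jn}^{*}(x_j) \Big\}
\le \sum_{\mathbf{x} \in \mathcal{X}} \sum_{i \in A(\mathbf{x})} \sum_{j \in B(\mathbf{x})} P\{\varphi_{in}^{*}(x_i) \le \varphi_{jn}^{*}(x_j)\}. \tag{5.1}$$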


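Now, for each $\mathbf{x} \in \mathcal{X}$, $i \in A(\mathbf{x})$, $j \in B(\mathbf{x})$,

$$P\{\varphi_{in}^{*}(x_i) \le \varphi_{jn}^{*}(x_j)\}
= P\{[\varphi_{in}^{*}(x_i) - \varphi_i(x_i)] - [\varphi_{jn}^{*}(x_j) - \varphi_j(x_j)] \le \varphi_j(x_j) - \varphi_i(x_i)\}$$

$$\le P\{[\varphi_{in}^{*}(x_i) - \varphi_i(x_i)] - [\varphi_{jn}^{*}(x_j) - \varphi_j(x_j)] \le -\varepsilon\}$$

$$\le P\{\varphi_{in}^{*}(x_i) - \varphi_i(x_i) \le -\varepsilon/2\} + P\{\varphi_{jn}^{*}(x_j) - \varphi_j(x_j) \ge \varepsilon/2\}. \tag{5.2}$$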

In (5.2), the first inequality is due to the definition of $\varepsilon$.
From (5.2), it suffices to consider the asymptotic behavior of
the probabilities $P\{\varphi_{jn}^{*}(x_j) - \varphi_j(x_j) \ge \varepsilon/2\}$ and $P\{\varphi_{in}^{*}(x_i) - \varphi_i(x_i)
\le -\varepsilon/2\}$.

2  2 

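Let $c_1 = \min_{1 \le i \le k}\ \min_{0 \le y \le N} \{\varepsilon^2 f_i^2(y)/2\}$. Then $c_1 > 0$. From the
definitions of $\varepsilon$ and $b(n, y, i)$, we see that, for
sufficiently large n, $\varepsilon > 2 \max_{1 \le i \le k}\ \max_{0 \le y \le N} \{b(n, y, i)\}$. Therefore, by
Lemma 4.2 c), Lemma 4.3 a) and Remark 4.2, for n large enough,

$$P\{\varphi_{in}^{*}(x_i) - \varphi_i(x_i) \ge \varepsilon/2\}
\le \sum_{y=0}^{x_i} P\{\varphi_{in}(y) - \varphi_i(y) \ge \varepsilon/2\}
\le \sum_{y=0}^{x_i} \exp\{-2n\, a_1^2(\varepsilon/2, y, n, i)\}
\le O(\exp(-c_1 n)). \tag{5.3}$$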


The last step of (5.3) follows from the fact that
$\exp\{-2n\, a_1^2(\varepsilon/2, y, n, i)\} \le O(\exp(-c_1 n))$ for all $0 \le y \le N$ and $1 \le i \le k$,
which is established easily by a straightforward computation and the
definitions of $a_1(\varepsilon/2, y, n, i)$ and $c_1$.
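To get a feeling for the size of the constants $\varepsilon$ and $c_1$, the following sketch evaluates them when every $G_i$ is a Beta(a, b) prior with integer parameters, in which case $\varphi_i(x) = (a + x)/(a + b + N)$ and $f_i$ is the beta-binomial probability function. The Beta priors and the function names are assumptions made for illustration only; the argument in the text holds for general priors.

```python
from fractions import Fraction
from itertools import product
from math import comb, exp, lgamma

def beta_binom_pmf(x, N, a, b):
    """Marginal f_i(x) of X_i when G_i is Beta(a, b) (beta-binomial)."""
    def log_beta(u, v):
        return lgamma(u) + lgamma(v) - lgamma(u + v)
    return comb(N, x) * exp(log_beta(a + x, b + N - x) - log_beta(a, b))

def rate_constants(N, priors):
    """Return (epsilon, c_1) for hypothetical integer Beta priors [(a_i, b_i), ...]."""
    k = len(priors)
    phi = [[Fraction(a + x, a + b + N) for x in range(N + 1)] for a, b in priors]
    gaps = []
    for xs in product(range(N + 1), repeat=k):      # every point of the sample space
        vals = [phi[i][xs[i]] for i in range(k)]
        top = max(vals)                             # value attained on A(x)
        gaps += [top - v for v in vals if v < top]  # gaps to the indices in B(x)
    eps = min(gaps)
    f_min = min(beta_binom_pmf(x, N, a, b) for a, b in priors for x in range(N + 1))
    return eps, float(eps) ** 2 * f_min ** 2 / 2

eps, c1 = rate_constants(2, [(1, 1), (2, 2)])
```

Even in this tiny example (k = 2, N = 2) the gap is $\varepsilon = 1/12$ and $c_1$ is about $3.1 \times 10^{-4}$. The exponential rate is therefore an asymptotic statement, and a large number of past stages may be needed before the bound becomes small.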


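Similarly,

$$P\{\varphi_{in}^{*}(x_i) - \varphi_i(x_i) \le -\varepsilon/2\}
\le \sum_{y=x_i}^{N} P\{\varphi_{in}(y) - \varphi_i(y) \le -\varepsilon/2\}
\le \sum_{y=x_i}^{N} \exp\{-2n\, a_2^2(\varepsilon/2, y, n, i)\}
\le O(\exp(-c_1 n)). \tag{5.4}$$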

Therefore, from (5.1) to (5.4), and the finiteness of the
space $\mathcal{X}$, we have

$$0 \le r(G, d_n^{*}) - r(G) \le O(\exp(-c_1 n)).$$

Similarly, for the sequence of empirical Bayes selection
rules $\{\tilde{d}_n^{*}\}_{n=1}^{\infty}$, we can prove that $0 \le r(G, \tilde{d}_n^{*}) - r(G) \le O(\exp(-c_2 n))$
for some $c_2 > 0$.


We  now  state  these  results  as  a  theorem. 

Theorem 5.1. Let $\{d_n^{*}\}_{n=1}^{\infty}$ and $\{\tilde{d}_n^{*}\}_{n=1}^{\infty}$ be the sequences of
empirical Bayes selection rules defined in (3.10) and (3.11),
respectively. Then,

$$r(G, d_n^{*}) - r(G) \le O(\exp(-c_1 n))$$

and

$$r(G, \tilde{d}_n^{*}) - r(G) \le O(\exp(-c_2 n))$$

for some $c_i > 0$, i = 1, 2.